Chennai Reservoir levels

Chennai Rainfall

Water Quality

Omitting NA Values

Changing Character Values to Numeric Datatype

Replacing NA Values with the Mean and Median
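The three cleaning steps named above (omitting NA values, converting character values to numbers, and filling NA values with the mean or median) can be sketched as follows. This is a minimal illustration using only the Python standard library; the helper names `to_number` and `impute` and the sample readings are hypothetical, and the code actually used for this analysis is not shown here.

```python
import math
from statistics import mean, median

def to_number(value):
    """Convert a raw cell to float; return None if it is missing or not numeric."""
    try:
        number = float(str(value).strip())
    except ValueError:
        return None
    return None if math.isnan(number) else number

def impute(values, strategy="mean"):
    """Replace None entries with the mean or median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed) if strategy == "mean" else median(observed)
    return [fill if v is None else v for v in values]

raw = ["10", "NA", " 30 ", "", "200"]            # made-up readings stored as text
numeric = [to_number(v) for v in raw]            # character -> numeric
dropped = [v for v in numeric if v is not None]  # option 1: omit NA values
mean_filled = impute(numeric, "mean")            # option 2: fill with the mean
median_filled = impute(numeric, "median")        # option 3: fill with the median
```

With a skewed column such as this one (a single reading of 200), the median fill stays closer to the typical value than the mean fill, which is one reason to compare both.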

Cleaning Outliers in the Reservoir Level Data
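One common way to clean outliers is the interquartile range (IQR) rule, which drops values lying more than 1.5 IQRs outside the first and third quartiles. The sketch below assumes that rule; the method actually applied to the reservoir data is not stated, and the function names and sample levels are illustrative only.

```python
from statistics import quantiles

def iqr_bounds(values, k=1.5):
    """Return (low, high) limits from the IQR rule; k=1.5 is the usual convention."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def remove_outliers(values, k=1.5):
    """Keep only the values that fall inside the IQR limits."""
    low, high = iqr_bounds(values, k)
    return [v for v in values if low <= v <= high]

levels = [100, 110, 120, 130, 140, 150, 160, 170, 3000]  # made-up reservoir levels
cleaned = remove_outliers(levels)
```

Here the quartiles are 120 and 160, so the limits are 60 and 220 and only the reading of 3000 is removed.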

Exploratory Data Analysis

For water level and rain level histogram plots:

x-axis: water/rain level
y-axis: count (frequency)
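The counts on the y-axis come from grouping the levels into equal-width bins. A small sketch of that binning, with a hypothetical `histogram_counts` helper and made-up levels (the bin width used for the actual plots is not stated):

```python
def histogram_counts(values, bin_width):
    """Count how many values fall in each [start, start + bin_width) bin."""
    counts = {}
    for v in values:
        start = (v // bin_width) * bin_width  # left edge of the bin containing v
        counts[start] = counts.get(start, 0) + 1
    return dict(sorted(counts.items()))

levels = [120, 180, 250, 260, 310, 90, 275]  # made-up water levels
counts = histogram_counts(levels, 100)
```

Each key is the left edge of a bin on the x-axis and each value is the height of the corresponding bar.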

Note: Inferences for the plots are given below them wherever necessary.

POONDI Analysis (water level and rainfall)

We can observe that, even with less rainfall in 2020, the water level was somewhat better than in 2019. This could be the result of action taken by the authorities concerned to replenish the reservoir from other available sources. Some rainfall in January and April also contributed to it.

Rainfall in the region occurs from mid-year until the end of the year. The amounts are not large, but each month's rainfall contributes to maintaining the water level. The plot also shows a decrease in rainfall in recent years.

We can observe outliers in the water level data for different months in the POONDI region, notably in August and September. We can also observe that the highest water level occurs in December.

Over the years, the water level of the POONDI reservoir has been decreasing. This suggests that the reservoir is not being adequately replenished.

Chennai has never received heavy rainfall, but in recent times the frequency of rainfall has also decreased. From the graph it is clear that rainfall used to be more frequent, even when the amounts were small.

Observing the water level trend for the last year, we can infer that the water level remained low even during the monsoon season (July to September). In only a few instances, during December, did it come close to 3000.

Rainfall during last year's monsoon season was poor, and the region received very little rain. Some rainfall in November and December replenished the water level to some extent.

CHOLAVARAM Analysis (water level and rainfall)

We can observe the same pattern here: a serious decline in water and rainfall levels over the years. Rainfall in this region also occurs from mid-year until the end of the year.

We can observe the decline in water level over the years. In recent years the region has recorded its lowest water levels. Its condition is even worse than that of the POONDI region.

We can observe the decrease in rainfall over the years, but the decrease in the frequency of rainfall is the greater concern. Even small amounts of rainfall help to maintain the water level, and this region has been hit even harder in terms of rainfall.

Here we can observe the resulting water level for this badly hit region. It is at its lowest for most of the year; some rainfall in the last three months brought a slight increase.

Most of the previous year's rainfall fell in November. We can also see more frequent but small amounts of rainfall in July. Low or no rainfall in the other months explains the region's water level.

REDHILLS Analysis (water level and rainfall)

In the REDHILLS region, the water level in the reservoir has been almost constant. In April, however, the required water level increases, and from September to October less water is needed because of heavy rainfall.

From the above result we can infer that 2020 had less rainfall. To cope with the water shortage, the water level in the reservoir was raised by a comparable amount, so the required water level was met.

The year 2012 marked the start of the decline in the water level of the Redhills reservoir, and it has been decreasing since then. However, we can see a significant change in 2020.

Rain levels in the Redhills area have fallen significantly since 2015. We can see how unevenly the rainfall is distributed over the months: except in the last two months, rain levels have dropped.

From the above results we can infer that, in 2020, April shows the highest water level and October the lowest water level requirement. The main reason for the water crisis may therefore be poor replenishment of the reservoir.

From the above graph it is clear that, in 2020, rainfall was highest in October and lowest in December.

CHEMBARAMBAKKAM Analysis (water level and rainfall)

From the above graph, Chembarambakkam shows its highest water level in 2016, during October and November, which indicates that the reservoir was properly replenished at that time.

From the above result, Chembarambakkam sees a sharp fall in rainfall after 2016, which is not a good sign for water availability.

We found that Chembarambakkam was worst hit during 2019, with water levels dropping below 5000. However, it recovered from the loss in 2020.

Just like all other reservoirs, Chembarambakkam received its highest rainfall during November with significantly lower rain in other months of 2020.

From the above result for the reservoir water level in 2020, November has the highest water level and June the lowest.

November 2020 shows the highest rainfall, so there is less chance of a water crisis during that time, since the water level in the reservoir and the rainfall level are closely related.

Comparison of all four

Overall rainfall has decreased in this region in comparison to the previous year. Most rainfall occurred during the monsoon season in 2019, while in 2020 it occurred during November.

Overall rainfall has decreased in this region in comparison to the previous year. Most rainfall occurred during October in 2019, while in 2020 it occurred during November.

Overall rainfall has increased in this region in comparison to the previous year. Most rainfall occurred during October in 2019, while in 2020 it occurred during November, with a decent amount in October as well, which resulted in an increase in the overall rainfall level.

Overall rainfall has decreased in this region in comparison to the previous year. Most rainfall occurred during October in 2019, while in 2020 it occurred during November.

The water level in this region has increased from the previous years, and it is constant for most months of 2020.

The water level in this region has increased from the previous year, but it is very low throughout 2020 except in November.

The water level in this region has increased drastically from the previous year. This can be explained by the increase in rainfall in this region. It is also constant throughout the year, which shows balanced usage and replenishment.

The water level in this region has increased drastically from the previous year, and it is constant throughout 2020.

Machine Learning Model for Chennai Reservoir dataset

To determine the optimal number of clusters, we select the value of k at the elbow, i.e. the point after which the distortion starts decreasing in a roughly linear fashion.
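The elbow is usually read off the distortion plot by eye. One simple way to approximate that choice in code is to pick the k at which the curve bends most sharply, measured by the largest second difference of the distortions. The sketch below assumes that heuristic; `elbow_k` is a hypothetical helper, and the distortion values are made up rather than taken from the reservoir dataset.

```python
def elbow_k(distortions, first_k=1):
    """Return the k with the largest second difference, i.e. the sharpest bend."""
    best_k, best_bend = None, float("-inf")
    for i in range(1, len(distortions) - 1):
        # How much the decrease slows down at this point on the curve.
        bend = distortions[i - 1] - 2 * distortions[i] + distortions[i + 1]
        if bend > best_bend:
            best_k, best_bend = first_k + i, bend
    return best_k

distortions = [1000.0, 300.0, 220.0, 160.0, 110.0, 70.0]  # made-up values for k = 1..6
k = elbow_k(distortions)
```

For these values the drop from k = 1 to k = 2 is large and later drops are small and nearly even, so the heuristic selects k = 2.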

Analysis and Modelling of Water Level Dataset

After determining the number of clusters, k-means clustering is applied to the dataset to find groups that are not explicitly labeled in the data. This clustering analysis finds subgroups of the samples based on the features of the dataset. The algorithm works in the manner of expectation-maximization, alternating between assigning each sample to its nearest cluster centre and recomputing the centres. From the result, we take the number of clusters used to display the subgroups of the water level dataset to be 2.
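The alternation between assignment and update can be sketched in a few lines. This is a minimal standard-library illustration of k-means with k = 2, not the model fitted in this analysis: the `kmeans` function, its deterministic choice of starting centres, and the (water level, rainfall) pairs are all assumptions made for the example.

```python
import math

def kmeans(points, k, iterations=100):
    """Plain k-means: returns (labels, centroids) for a list of numeric tuples."""
    # Deterministic start for reproducibility: k evenly spaced points from the data.
    step = max(1, len(points) // k)
    centroids = [points[i * step] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
    return labels, centroids

# Made-up (water level, rainfall) pairs forming two visibly separate groups.
points = [(100, 0), (120, 5), (110, 2), (900, 40), (950, 55), (920, 45)]
labels, centroids = kmeans(points, k=2)
```

On this toy data the first three samples end up in one cluster and the last three in the other. With real data, the features would normally be scaled first so that the larger-valued column does not dominate the distance.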

Fitting the model gives a clustering of all the regions with respect to Day, which we use to identify which region receives the most rainfall and which has the highest reservoir water level. From the above results, the region with the highest rainfall is REDHILLS, and the region with the highest reservoir water level is POONDI.

Modelling Statistics of Water reservoir with respect to different regions